Batch-Mode Active Learning via Error Bound Minimization

Authors

  • Quanquan Gu
  • Tong Zhang
  • Jiawei Han
Abstract

Active learning has proven quite effective at reducing human labeling effort by actively selecting the most informative examples to label. In this paper, we present a batch-mode active learning method based on logistic regression. Our key motivation is an out-of-sample bound on the estimation error of the class distribution in logistic regression, conditioned on any fixed training sample. It differs from a typical PAC-style passive learning error bound, which relies on the i.i.d. assumption of example-label pairs. In addition, it does not contain the class labels of the training sample. Therefore, it can be used directly to design an active learning algorithm that minimizes this bound iteratively. We also discuss the connections between the proposed method and some existing active learning approaches. Experiments on benchmark UCI datasets and text datasets demonstrate that the proposed method significantly outperforms state-of-the-art active learning methods.
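The abstract's central point is that a criterion free of class labels can be minimized over candidate batches before any label is requested. The sketch below illustrates only that idea. It does not use the paper's bound, which is not given here. As a stand-in it uses the trace of the inverse regularized scatter matrix (the A-optimality criterion from experimental design), and the names `label_free_score`, `greedy_batch`, and the parameter `lam` are hypothetical.

```python
import numpy as np

def label_free_score(X_sel, lam=1.0):
    """Stand-in criterion (NOT the paper's bound): trace of the inverse
    regularized scatter matrix. It uses features only, never labels."""
    d = X_sel.shape[1]
    A = X_sel.T @ X_sel + lam * np.eye(d)
    return np.trace(np.linalg.inv(A))

def greedy_batch(X_pool, batch_size, lam=1.0):
    """Greedily pick `batch_size` pool indices, each time adding the
    example that gives the lowest score for the enlarged batch."""
    selected = []
    remaining = list(range(X_pool.shape[0]))
    for _ in range(batch_size):
        best_idx = min(
            remaining,
            key=lambda i: label_free_score(X_pool[selected + [i]], lam),
        )
        selected.append(best_idx)
        remaining.remove(best_idx)
    return selected

# Hypothetical unlabeled pool: 50 examples with 5 features each.
rng = np.random.default_rng(0)
X_pool = rng.normal(size=(50, 5))
batch = greedy_batch(X_pool, batch_size=4)
```

In an active learning loop, the indices in `batch` would be sent for labeling, the logistic regression model refit, and the selection repeated on the remaining pool.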

Related papers

Dynamic Batch Mode Active Learning via L1 Regularization

We propose a method for dynamic batch mode active learning where the batch size and selection criteria are integrated into a single formulation.

An extension of the Chernoff-based transformation matrix estimation method for on-line learning in Bayesian binary hypothesis tests

In a previous paper [8], we proposed a method to improve the classification between two classes in a new transformed space using the Chernoff similarity measure. The key idea is to estimate a transformation matrix such that the overlap between the pdfs associated with the competing classes is minimized, thus minimizing the classification error. Starting from a surrogate cost fun...

Convex Batch Mode Active Sampling via alpha-relative Pearson Divergence

Active learning is a machine learning technique that trains a classifier after selecting a subset from an unlabeled dataset for labeling and using the selected data for training. Recently, batch mode active learning, which selects a batch of samples to label in parallel, has attracted a lot of attention. Its challenge lies in the choice of criteria used for guiding the search of the optimal bat...

Active Instance Sampling via Matrix Partition

Recently, batch-mode active learning has attracted a lot of attention. In this paper, we propose a novel batch-mode active learning approach that selects a batch of queries in each iteration by maximizing a natural mutual information criterion between the labeled and unlabeled instances. By employing a Gaussian process framework, this mutual information based instance selection problem can be f...

Counterfactual Risk Minimization

We develop a learning principle and an efficient algorithm for batch learning from logged bandit feedback. Unlike in supervised learning, where the algorithm receives training examples $(x_i, y_i^*)$ with annotated correct labels $y_i^*$, bandit feedback merely provides a cardinal reward $\delta_i \in \mathbb{R}$ for the prediction $y_i$ that the logging system made for context $x_i$. Such bandit feedback is ubiquitous in...

Publication date: 2014